A versatile parallel block-tridiagonal solver for spectral codes

Authors

  • Jungpyo Lee
  • John C. Wright
Abstract

This paper introduces a three-dimensional (3-D) processor configuration for a parallel solver of massive block-tridiagonal matrix systems. The added parallelization dimension is intended to delay the saturation of scaling caused by communication overhead and inefficient parallelization. A semi-empirical formula for the matrix operation count of typical parallel algorithms is estimated, including the saturation effect in the 3-D processor grid. As the most suitable algorithm, a combination of the "Divide-and-Conquer" and "Cyclic Odd-Even Reduction" methods is implemented in TORIC, an MPI-Fortran90 based numerical code. Using thousands of processors, the new 3-D parallel solver of TORIC runs about 4 times faster at the optimized 3-D grid than the old 2-D parallel solver under the same conditions.
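To make the "Cyclic Odd-Even Reduction" idea concrete, the following is a minimal serial sketch for a scalar tridiagonal system, not the TORIC implementation. The function name `cyclic_reduction` and the array layout (`a` sub-diagonal, `b` diagonal, `c` super-diagonal, `d` right-hand side, with `a[0] = c[-1] = 0`) are assumptions made for illustration, and the sketch assumes no pivoting is needed (for example, a diagonally dominant matrix). In a block-tridiagonal solver the scalar divisions would become block solves, and the eliminations within one level are independent of each other, which is what makes the method attractive for parallelization.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] by odd-even reduction.

    Illustrative scalar, serial sketch (hypothetical helper, no pivoting).
    Requires a[0] == 0 and c[-1] == 0.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    # Reduction: use the even-indexed neighbour equations to eliminate
    # x[i-1] and x[i+1] from every odd-indexed equation i.
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):
        alpha = a[i] / b[i - 1]
        diag = b[i] - alpha * c[i - 1]
        rhs = d[i] - alpha * d[i - 1]
        upper = 0.0
        if i + 1 < n:
            gamma = c[i] / b[i + 1]
            diag -= gamma * a[i + 1]
            rhs -= gamma * d[i + 1]
            upper = -gamma * c[i + 1]
        ra.append(-alpha * a[i - 1])
        rb.append(diag)
        rc.append(upper)
        rd.append(rhs)

    # The odd-indexed unknowns form a tridiagonal system of about half the size.
    x = [0.0] * n
    x[1::2] = cyclic_reduction(ra, rb, rc, rd)

    # Back-substitution: recover each even-indexed unknown from its own equation.
    for i in range(0, n, 2):
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x
```

Each recursion level halves the number of unknowns, so a system with n rows is reduced in roughly log2(n) levels.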

Similar resources

A block-tridiagonal solver with two-level parallelization for finite element-spectral codes

Two-level parallelization is introduced to solve a massive block-tridiagonal matrix system. One level is used for distributing blocks whose size is as large as the number of block rows due to the spectral basis, and the other level is used for parallelizing in the block row dimension. The purpose of the added parallelization dimension is to delay the saturation of the scaling due to communicat...

GPGPU parallel algorithms for structured-grid CFD codes

A new high-performance general-purpose graphics processing unit (GPGPU) computational fluid dynamics (CFD) library is introduced for use with structured-grid CFD algorithms. A novel set of parallel tridiagonal matrix solvers, implemented in CUDA, is included for use with structured-grid CFD algorithms. The solver library supports both scalar and block-tridiagonal matrices suitable for approxima...

BCYCLIC: A parallel block tridiagonal matrix cyclic solver

A block tridiagonal matrix is factored with minimal fill-in using a cyclic reduction algorithm that is easily parallelized. Storage of the factored blocks allows the application of the inverse to multiple right-hand sides which may not be known at factorization time. Scalability with the number of block rows is achieved with cyclic reduction, while scalability with the block size is achieved us...

Alternating-Direction Line-Relaxation Methods on Multicomputers

We study the multicomputer performance of a three-dimensional Navier-Stokes solver based on alternating-direction line-relaxation methods. We compare several multicomputer implementations, each of which combines a particular line-relaxation method and a particular distributed block-tridiagonal solver. In our experiments, the problem size was determined by resolution requirements of the applicat...

Implementation of a Fully-Balanced Periodic Tridiagonal Solver on a Parallel

While parallel computers offer significant computational performance, it is generally necessary to evaluate several programming strategies. Two programming strategies for a fairly common problem (a periodic tridiagonal solver) are developed and evaluated. Simple model calculations as well as timing results are presented to evaluate these strategies. The particular tridiagonal solver evaluated is us...


Publication date: 2010